-
Notifications
You must be signed in to change notification settings - Fork 13.8k
Add vsx register support for ppc inline asm, and implement preserves_flag option #146949
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
This comment has been minimized.
This comment has been minimized.
a258f60
to
b63ecb2
Compare
Some changes occurred in compiler/rustc_codegen_gcc |
This comment has been minimized.
This comment has been minimized.
For the gcc backend, what is the preferred way to test these changes? |
For now, we don't have gcc assembly or codegen tests, so I would leave this as is. |
Ping. |
extended_asm.add_clobber("cc"); | ||
match asm_arch { | ||
InlineAsmArch::PowerPC | InlineAsmArch::PowerPC64 => { | ||
let clobbers = ["cr0", "cr1", "cr2", "cr3", "cr4", "cr5", "cr6", "cr7", "xer"]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not clear whether this is the approach we want to go with for PPC assembly. Currently we have the ability to explicitly mark cr* and xer as clobbers with out("cr0") _
. Is this something that we want to preserve? Do we want these to always be marked as clobbered by default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The choice here is very conservative. I think ISA is fairly explicit about which instructions clobber cr
s and/or xer
. It's mostly cr0 (gpr .
ops), cr6 (vsx/vmx .
ops), and xer (the something and carry ops, and some shift ops).
For ergonomics, limiting this to cr0
and xer
would capture the most common cases. Likewise if preserve_flags
is not used when it should, it's not clobbering callee-save cr
s.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
most of the FP ops clobber cr1
if you use the .
variant.
| PowerPC | `reg_nonzero` | `r[3-12]`, `r[14-28]` | `b` | | ||
| PowerPC | `freg` | `f[0-31]` | `f` | | ||
| PowerPC | `vreg` | `v[0-31]` | `v` | | ||
| PowerPC | `vsreg | `vs[0-63]` | `vs` | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| PowerPC | `vsreg | `vs[0-63]` | `vs` | | |
| PowerPC | `vsreg | `vs[0-63]` | `wa` | |
b63ecb2
to
57b0d79
Compare
Some changes occurred in src/doc/unstable-book/src/compiler-flags/branch-protection.md cc @rust-lang/project-exploit-mitigations, @rcvalle |
This comment has been minimized.
This comment has been minimized.
It seems your rebase dragged a few other commits with it... |
57b0d79
to
47d8404
Compare
This PR was rebased onto a different master commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
extended_asm.add_clobber("cc"); | ||
match asm_arch { | ||
InlineAsmArch::PowerPC | InlineAsmArch::PowerPC64 => { | ||
let clobbers = ["cr0", "xer"]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should probably include cr1
for the FP .
instructions and cr6
for the vector instructions. The rest of the cr
fields should need explicit instruction arguments so are much less likely to be forgotten.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using a dotted fp or vector op is almost always by explicit choice. I think in all cases, a non-dotted form available. The dotted fp usage is only useful when checking for fp exceptions.
xer
and cr0
are used in common instructions with only dotted forms available, or always clobber xer[ca], subfic
andi.
srad
come to mind.
}, | ||
// VSX is a superset of altivec. | ||
Self::vsreg => types! { | ||
vsx: F32, F64, VecI8(16), VecI16(8), VecI32(4), VecI64(2), VecF32(4), VecF64(2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
technically this should also include 128-bit integer/fp types and f16
since there are instructions for operating on those, but we can leave those out for now.
vs63 : v31; | ||
} | ||
// f0-f31 (vsr0-vsr31) and v0-v31 (vsr32-vsr63) do not conflict. | ||
// For more detail, see ISA 3.1, Book I, Section 7.2. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// For more detail, see ISA 3.1, Book I, Section 7.2. | |
// For more detail, see Power ISA v3.1C, Book I, Section 7.2.1.1 and 7.2.1.2 on page 486 (522): | |
// https://files.openpower.foundation/s/9izgC5Rogi5Ywmm |
Where supported, VSX is a 64x128b register set which encompasses both the floating point and vector registers. In the type tests, xvsqrtdp is used as it is the only two-argument vsx opcode supported by all targets on llvm. If you need to copy a vsx register, the preferred way is "xxlor xt, xa, xa".
47d8404
to
9d51dd0
Compare
I'm inclined to only have this cover Also we may want to disallow specifying |
I think we want to allow both. As far as I can tell, the As for |
Implemented preserves_flags similar to the `cc` pseudo-register on gcc and clang. The gcc inline documentation doesn't actually state what this does on powerpc* targets, but inspection of the source shows it is equivalent to condition register field `cr0`.
9d51dd0
to
05ad94b
Compare
This should address the last(?) missing pieces of inline asm for ppc:
Explicit VSX register support. ISA 2.06 (POWER7) added a 64x128b register overlay extending the fpr's to 128b, and unifies them with the vmx (altivec) registers. Implementations details within gcc/llvm percolate up, and require using the
x
template modifier. I have updated the inline asm to implicitly include this for vsx arguments which do not specify it.Support for the gcc codegen backend is still a todo.Implement the
preserves_flags
option. All ABI's, and all ISAs store their flags incr
, and the carry bit lives insidexer
. The other status registers hold sticky bits or control bits which do not affect branch instructions.There is some interest in the e500 (powerpcspe) port. Architecturally, it has a very different FP ISA, and includes a simd extension called SPR (which is not IBM's cell SPE). Notably, it does not have altivec/fpr/vsx registers. It also has an SPE accumulator register which its ABI marks as volatile, but I am not sure if the compiler uses it.